Word | Frequency | Number of right neighbors | Number of left neighbors | Ratio (right/left) |
---|---|---|---|---|
O | 4229 | 313 | 3 | 104.3333 |
No | 799 | 52 | 1 | 52.0000 |
Em | 623 | 50 | 1 | 50.0000 |
A | 3095 | 248 | 8 | 31.0000 |
Os | 997 | 77 | 3 | 25.6667 |
As | 585 | 51 | 2 | 25.5000 |
Esta | 230 | 19 | 1 | 19.0000 |
Por | 267 | 19 | 1 | 19.0000 |
Não | 469 | 31 | 2 | 15.5000 |
mas | 1355 | 61 | 4 | 15.2500 |
Já | 227 | 15 | 1 | 15.0000 |
Apartamento | 676 | 14 | 1 | 14.0000 |
E | 440 | 27 | 2 | 13.5000 |
seu | 930 | 96 | 9 | 10.6667 |
sua | 1030 | 92 | 9 | 10.2222 |
Com | 272 | 10 | 1 | 10.0000 |
Foi | 182 | 10 | 1 | 10.0000 |
Se | 318 | 19 | 2 | 9.5000 |
nesse | 63 | 9 | 1 | 9.0000 |
mortais | 81 | 9 | 1 | 9.0000 |

Word | Frequency | Number of right neighbors | Number of left neighbors | Ratio (right/left) |
---|---|---|---|---|
Type | 336 | 1 | 45 | 0.0222 |
milhões | 568 | 2 | 35 | 0.0571 |
conferência | 144 | 1 | 8 | 0.1250 |
cerca | 419 | 2 | 15 | 0.1333 |
greve | 50 | 1 | 7 | 0.1429 |
falta | 114 | 1 | 7 | 0.1429 |
Reino | 133 | 1 | 7 | 0.1429 |
consumo | 53 | 1 | 7 | 0.1429 |
Vitória | 134 | 1 | 7 | 0.1429 |
casas | 117 | 1 | 7 | 0.1429 |
possibilidade | 99 | 1 | 6 | 0.1667 |
território | 61 | 1 | 6 | 0.1667 |
restantes | 52 | 1 | 6 | 0.1667 |
campos | 48 | 1 | 6 | 0.1667 |
distribuição | 41 | 1 | 6 | 0.1667 |
maiores | 70 | 1 | 6 | 0.1667 |
vontade | 59 | 1 | 6 | 0.1667 |
Transação | 297 | 1 | 6 | 0.1667 |
continuar | 127 | 2 | 11 | 0.1818 |
toda | 183 | 2 | 11 | 0.1818 |
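In this subsection, we compute, for each word, the ratio of the number of right neighbors to the number of left neighbors. Again, we look for words with extreme ratios: the first table above lists the 20 words with the highest ratios, the second the 20 words with the lowest.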
Data for first table:
select word, w.freq, aa.cnt, bb.cnt, aa.cnt/bb.cnt as r
from words w,
  (select w1_id, count(c.w2_id) as cnt from co_n c where w1_id>100 group by w1_id) aa,
  (select w2_id, count(c.w1_id) as cnt from co_n c where w2_id>100 group by w2_id) bb
where w_id=aa.w1_id and aa.w1_id=bb.w2_id
order by r desc limit 20;
Diagram data:
select aa.cnt, bb.cnt
from
  (select w1_id, count(c.w2_id) as cnt from co_n c where w1_id>100 group by w1_id) aa,
  (select w2_id, count(c.w1_id) as cnt from co_n c where w2_id>100 group by w2_id) bb
where aa.w1_id=bb.w2_id;
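The ratio query can be tried on a small scale with the sketch below, which runs it against an in-memory SQLite database. Table and column names (`words`, `co_n`, `w_id`, `w1_id`, `w2_id`) are taken from the queries above. Everything else is an assumption made for illustration: the reduced schema, the invented rows, the helper name `neighbor_ratios`, and the reading of `w1_id` as the left word and `w2_id` as the right word of a neighbor pair. SQLite divides integers without a remainder, so the sketch multiplies by `1.0`; the four-decimal ratios in the tables suggest the original database returns decimals for `/` directly.

```python
import sqlite3

# Toy database: names follow the queries in the text, the rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE words (w_id INTEGER PRIMARY KEY, word TEXT, freq INTEGER);
    CREATE TABLE co_n  (w1_id INTEGER, w2_id INTEGER);
""")
conn.executemany("INSERT INTO words VALUES (?, ?, ?)", [
    (101, "O", 9), (102, "casa", 3), (103, "carro", 3), (104, "livro", 3),
    (105, "milhões", 5), (106, "dois", 2), (107, "três", 2), (108, "de", 7),
])
# Each row is one distinct neighbor pair (assumed: w1_id = left word, w2_id = right word).
conn.executemany("INSERT INTO co_n VALUES (?, ?)", [
    (101, 102), (101, 103), (101, 104),   # "O" is followed by three different words
    (108, 101),                           # "O" is preceded by one word
    (106, 105), (107, 105),               # "milhões" is preceded by two words
    (105, 108),                           # "milhões" is followed by one word
])

QUERY = """
SELECT w.word, w.freq, aa.cnt, bb.cnt, 1.0 * aa.cnt / bb.cnt AS r
FROM words w,
     (SELECT w1_id, COUNT(w2_id) AS cnt FROM co_n WHERE w1_id > 100 GROUP BY w1_id) aa,
     (SELECT w2_id, COUNT(w1_id) AS cnt FROM co_n WHERE w2_id > 100 GROUP BY w2_id) bb
WHERE w.w_id = aa.w1_id AND aa.w1_id = bb.w2_id
ORDER BY r {direction}
LIMIT 20
"""

def neighbor_ratios(connection, descending=True):
    """Return (word, freq, right neighbors, left neighbors, ratio) rows."""
    direction = "DESC" if descending else "ASC"
    return connection.execute(QUERY.format(direction=direction)).fetchall()

for row in neighbor_ratios(conn):
    print(row)
```

Because both subqueries are joined with an inner join, words that have neighbors on only one side (here "casa" or "dois") drop out of the result, so no division by zero occurs. Calling `neighbor_ratios(conn, descending=False)` sorts the same rows in ascending order, which is one plausible way to obtain a list like the second table.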
5.1.7.1 Number of NN co-occurrences vs. Frequency I
5.1.7.2 Number of NN co-occurrences vs. Frequency II